Finding Entities or Information Using Annotations

Authors

  • Rianne Kaptein
  • Jaap Kamps
Abstract

User-generated content often provides more than just textual data: tags or annotations are added to label data, users can comment on the data in discussions, and links exist not only between pages but also between users and annotations. In this paper we explore the use of annotations in the form of categories to find entities or information in Wikipedia. The main differences between entity ranking and ad hoc retrieval are the assessment criteria and the provision of target categories for entity ranking topics. Analyzing the relevance assessment sets, we see that entity ranking results have more focused categories. The provided target category is, however, not always the most informative category. Furthermore, we show that techniques for entity ranking can also be applied to ad hoc topics, and that automatically assigned target categories are good surrogates for manually assigned categories. Although using category information leads to larger improvements on entity ranking topics, significant improvements can also be achieved on ad hoc topics.

Similar resources

Document Annotation Support Using Biomedical Ontologies

Adding annotations to documents by extracting data from text yields richer document representation, which users can exploit for various tasks such as search and browsing. However, data extraction is hard, especially in large-scale heterogeneous settings. A more focused technique for data extraction is entity linking, which does not extract new data from documents, but links words in documents t...

Pocket Knowledge Base Population

Existing Knowledge Base Population methods extract relations from a closed relational schema with limited coverage, leading to sparse KBs. We propose Pocket Knowledge Base Population (PKBP), the task of dynamically constructing a KB of entities related to a query and finding the best characterization of relationships between entities. We describe novel Open Information Extraction methods which ...

VoldemortKG: Mapping schema.org and Web Entities to Linked Open Data

Increasingly, webpages mix entities coming from various sources and represented in different ways. It can thus happen that the same entity is both described by using schema.org annotations and by creating a text anchor pointing to its Wikipedia page. Often, those representations provide complementary information which is not exploited since those entities are disjoint. We explored the extent to...

Profiling of Semantically Annotated Proteins

We have exploited semantic annotations of biological entities to develop a novel approach to infer new knowledge. We demonstrate this in four use cases based on the Gene Expression Ontology, an applied ontology that we developed to serve the needs of researchers involved in the analysis of genes and proteins implicated in transcriptional control of pathways/diseases. We have found that semantic...

Beyond Metadata: Enriching life science publications in Livivo with semantic entities from the linked data cloud

Queries in literature search engines are usually conducted on metadata derived from scientific publications. The search engine LIVIVO holds a corpus of 63 Million life science publications. About 25 Million publications in LIVIVO are taken from PubMed that have annotations with Medical Subject Headings (MeSH). The other publications have heterogeneous keyword annotations. Hence, a workflow is d...

Publication date: 2009